TITLE OF THE INVENTION
SPEECH INPUT TERMINAL, SPEECH RECOGNITION APPARATUS, 
SPEECH COMMUNICATION SYSTEM, AND 
SPEECH COMMUNICATION METHOD 

FIELD OF THE INVENTION 
The present invention relates to a speech input
terminal, speech recognition apparatus, speech
communication system, and speech communication method,
which are used to transmit speech data through a
communication network and execute speech recognition.

BACKGROUND OF THE INVENTION 
A speech communication system has been proposed in which
speech data is sent from a speech input terminal such as
a portable telephone to a host server through a
communication network, and processing such as retrieval of
specific information is executed. In such a speech
communication system, since data can be transmitted and
received by speech, operation can be facilitated.

However, speech data fluctuates depending on the
characteristics of the speech input terminal itself (such as
a portable telephone), the surrounding environment,
and the like, and hence satisfactory speech recognition may
not be performed.

In addition, since communication is performed under
the same communication conditions under any circumstances,
high communication efficiency cannot always be ensured.

SUMMARY OF THE INVENTION

The present invention has been made in consideration
of the situation associated with a speech input terminal,
and has as its object to provide a speech input terminal,
speech recognition apparatus, speech communication system,
and speech communication method which can implement optimal
speech recognition or communication.

According to the present invention, there is provided
a speech input terminal for transmitting speech data to a
speech recognition apparatus through a wire or wireless
communication network, comprising speech input means, means
for creating information for speech recognition, which is
unique to the speech input terminal or represents an
operation state thereof, and communication means for
transmitting the information to the speech recognition
apparatus.

In the present invention, the information is
information unique to the speech input terminal or
information about the surrounding environment or operation
state associated with the speaker himself/herself. For
example, the information includes the characteristics of the
speech input terminal itself, e.g., the characteristics of
a microphone for speech input, information about the
surrounding environment in which the speech input terminal
is used, or the speech features of the person using the speech
input terminal. This information also includes information
obtained by performing acoustic analysis processing for the
original data obtained from the input means.

The speech input terminal of the present invention can
further comprise means for, when a data conversion condition
for communication based on the information is received from
the speech recognition apparatus, converting the speech data
on the basis of the conversion condition.

The speech input terminal of the present invention can
further comprise means for storing the information, means
for determining whether there has been a change in the
information in each communication, and means for, when there
has been no change in the information, notifying the speech
recognition apparatus of the corresponding information.

In the speech input terminal of the present invention,
the terminal further comprises means for creating a speech
recognition model on the basis of the information, and the
communication means can transmit the information and/or the
speech recognition model to the speech recognition
apparatus.

According to the present invention, there is provided
a speech recognition apparatus comprising speech
recognition means for executing speech recognition
processing for speech data transmitted from a speech input
terminal through a wire or wireless communication network,
and means for receiving information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof from the speech input terminal,
wherein said speech recognition means executes speech
recognition processing on the basis of the information.

According to the present invention, there is provided
a speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising means for creating
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, on the basis of the transmitted speech data, and
means for executing speech recognition processing on the
basis of the information.

The speech recognition apparatus of the present
invention can further comprise means for creating a speech
recognition model on the basis of the information.

According to the present invention, there is provided
a speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising means for receiving
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof from the speech input terminal, means for
determining a data conversion condition for communication
on the basis of the information, and means for transmitting
the data conversion condition to the speech input terminal.

According to the present invention, there is provided
a speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising means for creating
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, on the basis of the transmitted speech data, means
for determining a data conversion condition for
communication on the basis of the information, and means for
transmitting the data conversion condition to the speech
input terminal.

In the speech recognition apparatus of the present
invention, the data conversion condition can include a data
conversion condition based on a quantization table created
on the basis of the information.

The speech recognition apparatus of the present
invention can further comprise means for, when the speech
input terminal comprises a plurality of speech input
terminals, storing the information in correspondence with
each of the speech input terminals.

The speech recognition apparatus of the present
invention can further comprise means for, when the speech
input terminal comprises a plurality of speech input
terminals, storing the speech recognition model in
correspondence with each of the speech input terminals.

The speech recognition apparatus of the present
invention can further comprise means for, when the speech
input terminal comprises a plurality of speech input
terminals, storing the data conversion condition in
correspondence with each of the speech input terminals.

According to the present invention, there is provided
a speech communication system comprising a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein the speech input terminal
comprises speech input means, means for creating information
for speech recognition, which is unique to the speech input
terminal or represents an operation state thereof, and
communication means for transmitting the information to the
speech recognition apparatus, and the speech recognition
apparatus comprises means for executing speech recognition
processing on the basis of the information.

According to the present invention, there is provided
a speech communication system comprising a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein the speech recognition
apparatus comprises means for creating information for
speech recognition, which is unique to the speech input
terminal or represents an operation state thereof, on the
basis of speech data from the speech input terminal, and means
for executing speech recognition processing on the basis of
the information.

According to the present invention, there is provided
a speech communication system comprising a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein the speech input terminal
comprises speech input means, means for creating information
for speech recognition, which is unique to the speech input
terminal or represents an operation state thereof, and
communication means for transmitting the information to the
speech recognition apparatus, and the speech recognition
apparatus comprises means for determining a data conversion
condition for communication on the basis of the information,
and means for transmitting the data conversion condition to
the speech input terminal.

According to the present invention, there is provided
a speech communication system comprising a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein the speech recognition
apparatus comprises means for creating information for
speech recognition, which is unique to the speech input
terminal or represents an operation state thereof, on the
basis of speech data from the speech input terminal, means
for determining a data conversion condition for
communication on the basis of the information, and means for
transmitting the data conversion condition to the speech
input terminal.

According to the present invention, there is provided
a speech communication method of transmitting speech data
from a speech input terminal to a speech recognition
apparatus through a wire or wireless communication network
comprising, in the speech input terminal, the step of creating
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, and the step of transmitting the information to the
speech recognition apparatus.

According to the present invention, there is provided
a speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising the step of receiving
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof from the speech input terminal, and the step of
executing speech recognition processing on the basis of the
information.

According to the present invention, there is provided
a speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising the step of creating
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, on the basis of data transmitted from the speech
input terminal, and the step of executing speech recognition
processing on the basis of the information.

According to the present invention, there is provided
a speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising the step of receiving
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof from the speech input terminal, the step of
determining a data conversion condition for communication
on the basis of the information, and the step of transmitting
the data conversion condition to the speech input terminal.

According to the present invention, there is provided
a speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising the step of creating
information for speech recognition, which is unique to the
speech input terminal or represents an operation state
thereof, on the basis of data transmitted from the speech
input terminal, the step of determining a data conversion
condition for communication on the basis of the information,
and the step of transmitting the data conversion condition
to the speech input terminal.

According to the present invention, there is provided
a speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising, in the speech input
terminal, the step of creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, and the step of
transmitting the information to the speech recognition
apparatus, and, in the speech recognition apparatus, the
step of executing speech recognition processing on the basis
of the information.

According to the present invention, there is provided
a speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising, in the speech recognition
apparatus, the step of creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, on the basis of
speech data from the speech input terminal, and the step of
executing speech recognition processing on the basis of the
information.

According to the present invention, there is provided
a speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising, in the speech input
terminal, the step of creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, and the step of
transmitting the information to the speech recognition
apparatus, and, in the speech recognition apparatus, the
step of determining a data conversion condition for
communication on the basis of the information, and the step
of transmitting the data conversion condition to the speech
input terminal.

According to the present invention, there is provided
a speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising, in the speech recognition
apparatus, the step of creating information for speech
recognition, which is unique to the speech input terminal
or represents an operation state thereof, on the basis of
speech data from the speech input terminal, the step of
determining a data conversion condition for communication
on the basis of the information, and the step of transmitting
the data conversion condition to the speech input terminal.

According to the present invention, there is provided
a storage medium recording a program for, in order to transmit
speech data from a speech input terminal to a speech
recognition apparatus through a wire or wireless
communication network, causing a computer to function as
means for creating information for speech recognition, which
is unique to the speech input terminal or represents an
operation state thereof, and communication means for
transmitting the information to the speech recognition
apparatus.

According to the present invention, there is provided
a storage medium recording a program for, in order to execute
speech recognition processing on the basis of speech data
sent from a speech input terminal through a wire or wireless
communication network, causing a computer to function as
means for receiving information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof from the speech input terminal,
and means for executing speech recognition processing on the
basis of the information.

According to the present invention, there is provided
a storage medium recording a program for, in order to execute
speech recognition processing on the basis of speech data
sent from a speech input terminal through a wire or wireless
communication network, causing a computer to function as
means for creating information for speech recognition, which
is unique to the speech input terminal or represents an
operation state thereof, on the basis of the speech data
transmitted from the speech input terminal, and means for
executing speech recognition processing on the basis of the
information.

According to the present invention, there is provided
a storage medium recording a program for, in order to execute
speech recognition processing on the basis of speech data
sent from a speech input terminal through a wire or wireless
communication network, causing a computer to function as
means for receiving information for speech recognition,
which is unique to the speech input terminal or represents
an operation state thereof from the speech input terminal,
means for determining a data conversion condition for
communication on the basis of the information, and means for
transmitting the data conversion condition to the speech
input terminal.

According to the present invention, there is provided
a storage medium recording a program for, in order to execute
speech recognition processing on the basis of speech data
sent from a speech input terminal through a wire or wireless
communication network, causing a computer to function as
means for creating information for speech recognition, which
is unique to the speech input terminal or represents an
operation state thereof, on the basis of the speech data
transmitted from the speech input terminal, means for
determining a data conversion condition for communication
on the basis of the information, and means for transmitting
the data conversion condition to the speech input terminal.

Other features and advantages of the present
invention will be apparent from the following description
taken in conjunction with the accompanying drawings, in
which like reference characters designate the same or
similar parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in
and constitute a part of the specification, illustrate
embodiments of the invention and, together with the
description, serve to explain the principles of the
invention.

Fig. 1 is a block diagram showing the arrangement of
a speech communication system according to an embodiment
of the present invention; and

Fig. 2 is a flow chart showing the processing
performed by the speech communication system according to
the embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Preferred embodiments of the present invention will
now be described in detail in accordance with the
accompanying drawings.

Fig. 1 is a block diagram showing the arrangement of 
a speech communication system according to an embodiment 
of the present invention. 

The speech communication system is comprised of a
portable terminal 100 serving as a speech input terminal,
a main body 200 serving as a speech recognition apparatus,
and a communication line 300 for connecting these
components to allow them to communicate with each other.

The portable terminal 100 includes an input/output
unit 101 for inputting/outputting speech, a communication
control unit 102 for executing communication processing
with the main body 200, an acoustic processing unit 103 for
performing acoustic processing for the input speech, an
environment information creation unit 104 for creating
information unique to the portable terminal 100 or
information indicating its operation state (to be referred
to as environment information hereinafter in this
embodiment), and a speech communication information
creation unit 105.

The main body 200 includes an environment adaptation
unit 201 for performing processing based on the environment
information of the portable terminal 100, a communication
control unit 202 for executing communication processing
with the portable terminal 100, a speech recognition unit
203 for executing speech recognition processing for speech
data from the portable terminal 100, a speech communication
information creation unit 204 for setting data conversion
conditions for communication, a speech recognition model
holding unit 205, and an application 206.

The sequence of operation of the speech communication
system having the above arrangement will be described next
with reference to Fig. 2. Fig. 2 is a flow chart showing
the processing performed by the speech communication
system.

The processing performed by the speech communication
system is constituted by an initialization mode of
analyzing environment information and a speech recognition
mode of communicating speech data.

In step S401, all processes are started. Information
for the start of processing is sent from the input/output
unit 101 to the communication control unit 202 of the main
body 200 through the communication control unit 102.

In step S402, a message is selectively sent from the
speech recognition unit 203 or application 206 to the
portable terminal 100. When, for example, supervised
speaker adaptation based on environment information is to
be performed, a list of contents to be read aloud by a user
is sent and output as a message (speech or characters) from
the input/output unit 101 of the portable terminal 100.
When microphone adaptation based on environment
information is to be performed, information for prompting
the utterance of speech for a few seconds may be output as
a message from the input/output unit 101 of the portable
terminal 100. On the other hand, when noise adaptation
based on environment information is to be performed, step
S402 may be skipped.

In step S403, speech data (containing noise) is
entered from the input/output unit 101 to create
environment information in the portable terminal 100.

In step S404, the acoustic processing unit 103
acoustically analyzes the entered speech data. If the
environment information is to be converted into a model
(average, variance, or phonemic model), the analysis result is
sent to the environment information creation unit 104.
Otherwise, the acoustic analysis result is sent from the
communication control unit 102 to the main body 200. Note that
the speech data may be sent directly to the main body 200,
without any acoustic analysis on the terminal side, so that
analysis and the like are performed on the main body 200 side.

When the environment information is converted into
a model in step S404, the flow advances to step S405 to cause
the environment information creation unit 104 to create
environment information. For the purpose of noise
adaptation, for example, environment information is
created by detecting a non-speech interval and obtaining
the average and variance of parameters in the interval. For
the purpose of microphone adaptation, environment
information is created by obtaining the average and
variance of parameters in a speech interval. For the
purpose of speaker adaptation, a phonemic model or the like
is created.
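
The noise-adaptation case of step S405 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the log-energy threshold used to detect non-speech frames, and the names `frame_energy` and `noise_statistics`, are assumptions introduced here. The description only states that a non-speech interval is detected and the average and variance of the parameters in it are obtained.

```python
import math
from statistics import fmean, pvariance


def frame_energy(frame):
    """Log energy of one frame of PCM samples (small floor avoids log(0))."""
    return math.log(sum(s * s for s in frame) / len(frame) + 1e-10)


def noise_statistics(frames, features, threshold):
    """Mean and variance of feature vectors over frames judged non-speech.

    frames    : list of sample lists, one per analysis frame
    features  : list of feature vectors aligned with `frames`
    threshold : assumed log-energy level below which a frame is non-speech
    """
    noise = [f for fr, f in zip(frames, features) if frame_energy(fr) < threshold]
    if not noise:
        raise ValueError("no non-speech frames found")
    dims = range(len(noise[0]))
    mean = [fmean(v[d] for v in noise) for d in dims]
    var = [pvariance([v[d] for v in noise], mu=mean[d]) for d in dims]
    return mean, var
```

For microphone adaptation the same statistics would be taken over the frames at or above the threshold (the speech interval) instead.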

In step S406, the created environment information
model, acoustic analysis result, or speech is sent from the
communication control unit 102 to the main body 200.
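
The three alternatives that steps S404 to S406 allow (raw speech, an acoustic analysis result, or an environment information model) can be pictured as a tagged payload. The sketch below is hypothetical: `PayloadKind`, `InitPayload`, and `build_init_payload` are names invented for illustration, and the description does not specify any message format.

```python
from dataclasses import dataclass
from enum import Enum


class PayloadKind(Enum):
    ENVIRONMENT_MODEL = "environment_model"
    ANALYSIS_RESULT = "analysis_result"
    RAW_SPEECH = "raw_speech"


@dataclass
class InitPayload:
    kind: PayloadKind
    data: object


def build_init_payload(speech, analyze=None, make_model=None):
    """Pick what the terminal sends in the initialization mode.

    analyze    : optional callable, speech -> acoustic analysis result
    make_model : optional callable, analysis result -> environment model
    """
    if analyze is None:
        # No analysis on the terminal: the main body analyzes the speech.
        return InitPayload(PayloadKind.RAW_SPEECH, speech)
    features = analyze(speech)
    if make_model is None:
        return InitPayload(PayloadKind.ANALYSIS_RESULT, features)
    return InitPayload(PayloadKind.ENVIRONMENT_MODEL, make_model(features))
```

Tagging the payload lets the main body decide whether it still has to analyze the data or build the model itself before adaptation.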

In step S407, the main body 200 receives the sent
environment information through the communication control
unit 202.

In step S408, the environment adaptation unit 201
performs environment adaptation with respect to a speech
recognition model in the speech recognition model holding
unit 205 on the basis of the environment information to
update the speech recognition model into an environment
adaptation speech recognition model. This model is then
held by the speech recognition model holding unit 205.

As a method for environment adaptation, for example,
a PMC technique can be used, which creates an environment
adaptation speech recognition model from a noise model and
speech recognition model. In the case of microphone
adaptation, for example, a CMS technique can be used, which
creates an environment adaptive speech recognition model
by using the average of speech for adaptation and a speech
recognition model.

In the case of speaker adaptation, for example, a
method of creating a speaker-adapted model by using a
speaker adaptation model and speech recognition model can
be used. If a speech or acoustic analysis result is sent
instead of an environment information model, a method of
converting environment information into a model and further
performing adaptation on the main body 200 side can be used.
Alternatively, a method of performing environment
adaptation by directly using a speech or acoustic analysis
result, EM learning technique, VFS speaker adaptation
technique, or the like can be used as an environment
adaptation method. Creating an environment adaptive
speech recognition model can improve recognition
performance.
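
As a rough illustration of how a noise model and a speech recognition model can be combined, the sketch below uses the log-add approximation, a simplification commonly associated with PMC. It is not the full PMC technique named above, which also compensates variances and works through cepstral-to-spectral transforms. It assumes, for illustration only, that both models are represented by log-spectral mean vectors; `log_add_compensate` and `gain` are hypothetical names.

```python
import math


def log_add_compensate(speech_mean, noise_mean, gain=1.0):
    """Combine clean-speech and noise log-spectral means channel by channel.

    Speech and noise are assumed additive in the linear spectral domain,
    so each compensated mean is log(gain * exp(speech) + exp(noise)).
    Variances are left untouched in this simplified sketch.
    """
    return [math.log(gain * math.exp(s) + math.exp(n))
            for s, n in zip(speech_mean, noise_mean)]
```

When the noise level in a channel is far below the speech level, the compensated mean stays close to the clean mean; when the two are equal, it rises by log 2.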

Obviously, a speech recognition model may be created
on the portable terminal 100 side and sent to the main body
200 to be used.

In step S409, in order to improve the communication
efficiency of speech recognition, the speech communication
information creation unit 204 performs environment
adaptation for a table for the creation of communication
speech information. A method of creating a scalar
quantization table of parameters of the respective
dimensions which are used for speech recognition by using
the distribution of environment adaptive speech
recognition models will be described below. Various methods
can be used for this purpose. The simplest method is a
method of searching the 3σ ranges of the respective
dimensions for the maximum and minimum values, and dividing
the interval therebetween into equal portions.
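
A minimal sketch of this simplest method follows, reading "3σ" as three standard deviations around each Gaussian mean of the adapted models. The representation of the models as `(mean_vector, std_vector)` pairs, the function name `uniform_table`, and the choice to place each level at the centre of its cell are assumptions made for illustration.

```python
def uniform_table(gaussians, levels):
    """Build one list of `levels` reconstruction values per dimension.

    gaussians : list of (mean_vector, std_vector) pairs taken from the
                environment adaptive speech recognition models
    levels    : number of quantization points per dimension
    """
    dims = len(gaussians[0][0])
    table = []
    for d in range(dims):
        # Extremes of the 3-sigma ranges over all distributions.
        lo = min(mu[d] - 3.0 * sd[d] for mu, sd in gaussians)
        hi = max(mu[d] + 3.0 * sd[d] for mu, sd in gaussians)
        step = (hi - lo) / levels
        # Each level sits at the centre of an equal-width cell.
        table.append([lo + (k + 0.5) * step for k in range(levels)])
    return table
```

For two one-dimensional distributions with means 0 and 2 and unit standard deviation, the covered range is [-3, 5], and four levels give cell width 2.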

The number of quantization points may be decreased
by a method of merging all distributions into one
distribution, searching the 3σ range (e.g., a range in which
most of the samples appearing in a Gaussian distribution are
included) for the maximum and minimum values, and dividing
the interval therebetween into equal portions.

As a more precise method, for example, a method of
assigning quantization points in accordance with the bias
of all distributions may be used. In this method, since
a scalar quantization table of the respective dimensions
is created by using the distribution of environment
adaptive speech recognition models, the bit rate for
communication can be decreased without degrading the
recognition performance, thus allowing efficient
communication.
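
One possible reading of "assigning quantization points in accordance with the bias of all distributions" is to place the points at equally spaced quantiles of the mixture of all Gaussians in a dimension, so that dense regions receive more points. The sketch below implements that interpretation for a single dimension. It is an assumption, not the method of the embodiment, and `mixture_cdf` and `quantile_levels` are hypothetical names.

```python
from statistics import NormalDist


def mixture_cdf(x, components):
    """CDF of an equally weighted mixture of 1-D Gaussians (mu, sd)."""
    return sum(NormalDist(mu, sd).cdf(x) for mu, sd in components) / len(components)


def quantile_levels(components, levels, tol=1e-6):
    """Quantization points at the (k + 0.5) / levels quantiles of the mixture."""
    lo = min(mu - 6.0 * sd for mu, sd in components)
    hi = max(mu + 6.0 * sd for mu, sd in components)
    points = []
    for k in range(levels):
        target = (k + 0.5) / levels
        a, b = lo, hi
        # Bisection: the mixture CDF is monotonically increasing.
        while b - a > tol:
            mid = 0.5 * (a + b)
            if mixture_cdf(mid, components) < target:
                a = mid
            else:
                b = mid
        points.append(0.5 * (a + b))
    return points
```

With two well-separated distributions and two levels, the points fall near the two means, whereas equal division of the overall range would place them in the sparsely populated gap between the modes.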

In step S410, the created scalar quantization table
is transmitted to the portable terminal 100.

In step S411, the created scalar quantization table
is received by the portable terminal 100 and stored in the
speech communication information creation unit 105.

With the above operation, the initialization mode is
terminated. If a plurality of portable terminals 100 are
present, the main body 200 can store data such as
environment information, speech recognition models, and
quantization tables in units of portable terminals.
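
Per-terminal storage on the main body 200 side can be pictured as a lookup keyed by a terminal identifier. The sketch below is illustrative only: the description does not say how terminals are identified or how the data is laid out, so `terminal_id`, `TerminalProfile`, and `ProfileStore` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TerminalProfile:
    environment_info: dict = field(default_factory=dict)
    recognition_model: object = None
    quantization_table: list = field(default_factory=list)


class ProfileStore:
    """Keeps one profile per portable terminal."""

    def __init__(self):
        self._profiles = {}

    def save(self, terminal_id, profile):
        self._profiles[terminal_id] = profile

    def load(self, terminal_id):
        # Returns None when the terminal has not been initialized yet.
        return self._profiles.get(terminal_id)
```

A missing profile would indicate that the initialization mode still has to be run for that terminal.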

The flow then shifts to the speech recognition mode.
In step S412, speech is input through the
input/output unit 101.

In step S413, the input speech data is acoustically
analyzed by the acoustic processing unit 103, and the
resultant data is sent to the speech communication
information creation unit 105.

In step S414, the speech communication information
creation unit 105 performs scalar quantization of the
acoustic analysis result on the speech data by using a
scalar quantization table, and encodes the data as speech
communication information. The encoded speech data is
transmitted to the main body 200 through the communication
control unit 102.
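
The encoding of step S414 and the corresponding decoding on the main body side can be sketched as nearest-level scalar quantization, with one index per dimension. The table layout (one list of levels per dimension) and the function names are assumptions; the description does not fix a bitstream format.

```python
import math


def encode(vector, table):
    """Index of the nearest quantization level in each dimension."""
    return [min(range(len(levels)), key=lambda k: abs(levels[k] - x))
            for x, levels in zip(vector, table)]


def decode(indices, table):
    """Reconstruct a feature vector from per-dimension indices."""
    return [levels[k] for k, levels in zip(indices, table)]


def bits_per_frame(table):
    """Bits needed to send one frame with fixed-length indices."""
    return sum(math.ceil(math.log2(len(levels))) for levels in table)
```

Because only the indices are transmitted, reducing the number of levels per dimension directly lowers the bit rate, which is why the table is derived from the adapted model distributions.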

In step S415, the main body 200 causes the speech
recognition unit 203 to decode the received speech data,
execute speech recognition processing, and output the
recognition result. Obviously, in this speech recognition
processing, the previously created speech recognition
model is used.

In step S416, the speech recognition result is
interpreted by the application 206 to obtain an application
result corresponding to the recognition result, and the
application result is sent to the communication control unit 202.

In step S417, the application result is sent to the 
portable terminal 100 through the communication control 
unit 202 of the main body 200. 

In step S418, the portable terminal 100 receives the
application result through the communication control unit
102.

In step S419, the portable terminal 100 outputs the
application result from the input/output unit 101. When
speech recognition is to be continued, the flow returns to
step S412.

In step S420, the communication is terminated.

As described above, in the speech communication
system of this embodiment, since speech recognition is
performed by using a speech recognition model that adapts
to the environment information of the portable terminal 100,
optimal speech recognition can be executed in
correspondence with each portable terminal. In addition,
since communication conditions are set on the basis of
environment information, communication efficiency can be
improved in correspondence with each portable terminal.

In this embodiment, in the case of noise, the average
and variance of parameters in a noise interval are obtained
and sent to the main body to perform noise adaptation of
a speech recognition model by the PMC technique. However,
another noise adaptation method can be used. In addition,
according to the method described above, the average and
variance are obtained on the terminal side and
transmitted. However, speech information may instead be
sent to the main body side, which then obtains the average
and variance and performs noise adaptation.
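
The terminal-side computation of the average and variance can be
sketched as follows. This is an illustration only: it assumes that
each frame of the noise interval is a list of acoustic parameters
of equal length and that the population variance is intended, and
the function name noise_statistics is hypothetical. The PMC
adaptation itself, performed on the main body, is not shown.

```python
from statistics import fmean, pvariance


def noise_statistics(noise_frames):
    """Per-dimension mean and variance over frames from a noise-only interval.

    noise_frames: list of frames, each a list of acoustic parameters
    (assumed to have the same length in every frame).
    """
    # Transpose so that each tuple holds one parameter across all frames.
    dims = list(zip(*noise_frames))
    means = [fmean(d) for d in dims]
    # Population variance is an assumption; the text only says "variance".
    variances = [pvariance(d, mu) for d, mu in zip(dims, means)]
    return means, variances
```

The two returned lists are what the terminal would send as
environment information in place of the raw noise signal.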

With regard to microphone characteristics, this
embodiment has exemplified the method of obtaining the
average and variance of parameters in a speech interval of
a certain duration, sending them to the main body, and
performing microphone characteristic adaptation of a
speech recognition model by the CMS technique. However,
another microphone characteristic adaptation method can
be used. In addition, according to the method described
above, the average and variance are obtained on the
terminal side and transmitted. However, speech
information may instead be sent to the main body side,
which then obtains the average and variance and performs
microphone characteristic adaptation.
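
For reference, the basic operation behind cepstral mean
subtraction (CMS) can be sketched as below. This is a minimal
feature-domain illustration, assuming that the parameters are
cepstral vectors of equal length; the embodiment applies the
technique to the speech recognition model on the main body, which
is not shown here. The function name is hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension mean of the interval from every frame.

    A fixed channel such as a microphone appears as a roughly constant
    offset in the cepstral domain, so removing the long-term mean
    reduces its influence.
    """
    n = len(frames)
    mean = [sum(col) / n for col in zip(*frames)]
    normalized = [[x - m for x, m in zip(frame, mean)] for frame in frames]
    return normalized, mean
```

The returned mean corresponds to the average that the terminal
would transmit to the main body.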

This embodiment has exemplified the speaker
adaptation method of creating a simple phonemic model
representing the user's speech features in advance, sending
it to the main body, and performing speaker adaptation of a
speech recognition model. However, speech information may
be sent to the main body side to perform speaker adaptation
by using the speech on the main body side. In this case as
well, various other speaker adaptation methods can be used.

In this embodiment, noise adaptation, microphone 
adaptation, and speaker adaptation are described 
independently. However, they can be combined as
appropriate.

In this embodiment, the initialization mode is to be 
performed before the speech recognition mode. Once the 
initialization mode is completed, however, speech 
recognition can be resumed from the speech recognition mode 
under the same environment. In this case, the previous 
environment information is stored on the portable terminal 
100 side, and environment information created in resuming 
speech recognition is compared with the stored information. 
If no change is detected, the corresponding notification 
is sent to the main body 200, or the main body 200 performs 
such determination on the basis of the sent environment 
information, thus executing speech recognition. 
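
The terminal-side check described above can be sketched as
follows. This is an illustration under assumptions: environment
information is taken to be a flat list of numbers, "no change" is
judged with a tolerance whose value is hypothetical, and the
message dictionaries are invented for the example.

```python
def environment_changed(previous, current, tolerance=0.1):
    """Return True when any statistic moved by more than the tolerance.

    The tolerance value is a hypothetical threshold, not taken from
    the embodiment.
    """
    if previous is None or len(previous) != len(current):
        return True
    return any(abs(p - c) > tolerance for p, c in zip(previous, current))


def resume_message(previous, current):
    """Message the terminal would send when resuming speech recognition."""
    if environment_changed(previous, current):
        # Environment differs: send the new information for re-adaptation.
        return {"type": "environment", "info": current}
    # Environment unchanged: a short notification is enough.
    return {"type": "no_change"}
```

When the notification indicates no change, the main body can keep
using the speech recognition model created in the earlier
initialization mode.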

In this embodiment, environment information is used
for both speech recognition processing and an improvement
in communication efficiency. However, only one of
these operations may be executed by using the environment
information.

Although the preferred embodiment of the present
invention has been described above, the objects of the
present invention are also achieved by supplying a storage
medium, which records a program code of a software program
that can realize the functions of the above-mentioned
embodiments, to the system or apparatus, and reading out and
executing the program code stored in the storage medium by
a computer (or a CPU or MPU) of the system or apparatus. In
this case, the program code itself read out from the storage
medium realizes the functions of the above-mentioned
embodiments, and the storage medium which stores the program
code constitutes the present invention. The functions of
the above-mentioned embodiments may be realized not only by
executing the readout program code by the computer but also
by some or all of actual processing operations executed by
an OS (operating system) running on the computer on the basis
of an instruction of the program code.

Furthermore, the functions of the above-mentioned
embodiments may be realized by some or all of actual
processing operations executed by a CPU or the like arranged
in a function extension board or a function extension unit,
which is inserted in or connected to the computer, after the
program code read out from the storage medium is written in
a memory of the extension board or unit.

As many apparently widely different embodiments of the
present invention can be made without departing from the
spirit and scope thereof, it is to be understood that the
invention is not limited to the specific embodiments thereof
except as defined in the claims.

WHAT IS CLAIMED IS:

1. A speech input terminal for transmitting speech data
to a speech recognition apparatus through a wire or wireless
communication network comprising:

speech input means;

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof; and

communication means for transmitting the information
to said speech recognition apparatus.

2. The terminal according to claim 1, wherein the 
information is based on at least one of a characteristic of 
said speech input means, a noise characteristic, and a 
speaker characteristic. 

3. The terminal according to claim 1, further comprising
means for, when a data conversion condition for 
communication based on the information is received from said 
speech recognition apparatus, converting the speech data on 
the basis of the conversion condition. 

4. The terminal according to claim 1, further comprising:

means for storing the information;

means for determining whether there has been a change
in the information in each communication; and

means for, when there has been no change in the
information, notifying said speech recognition apparatus of
the corresponding information.

5. The terminal according to claim 1, wherein

said terminal further comprises means for creating a
speech recognition model on the basis of the information,
and

said communication means transmits the information
and/or the speech recognition model to said speech
recognition apparatus.

6. A speech recognition apparatus comprising:

speech recognition means for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network; and

means for receiving information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof from said speech
input terminal, wherein said speech recognition means
executes speech recognition processing on the basis of the
information.

7. A speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising:

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, on the basis of the transmitted
speech data; and

means for executing speech recognition processing on
the basis of the information.

8. The apparatus according to claim 6, further comprising
means for creating a speech recognition model on the basis
of the information.

9. The apparatus according to claim 7, further comprising
means for creating a speech recognition model on the basis 
of the information. 

10. A speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising:

means for receiving information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof from said speech
input terminal;

means for determining a data conversion condition for
communication on the basis of the information; and

means for transmitting the data conversion condition
to said speech input terminal.

11. A speech recognition apparatus for executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising:

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, on the basis of the transmitted
speech data;

means for determining a data conversion condition for
communication on the basis of the information; and

means for transmitting the data conversion condition
to said speech input terminal.

12. The apparatus according to claim 10, wherein the data
conversion condition includes a data conversion condition
based on a quantization table created on the basis of the
information.

13. The apparatus according to claim 11, wherein the data 
conversion condition includes a data conversion condition 
based on a quantization table created on the basis of the 
information. 

14. The apparatus according to claim 6, further comprising
means for, when said speech input terminal comprises a
plurality of speech input terminals, storing the information
in correspondence with each of said speech input
terminals.

15. The apparatus according to claim 7, further comprising
means for, when said speech input terminal comprises a
plurality of speech input terminals, storing the information
in correspondence with each of said speech input
terminals.

16. The apparatus according to claim 10, further
comprising means for, when said speech input terminal
comprises a plurality of speech input terminals, storing the
information in correspondence with each of said speech input
terminals.

17. The apparatus according to claim 11, further
comprising means for, when said speech input terminal
comprises a plurality of speech input terminals, storing the
information in correspondence with each of said speech input
terminals.

18. The apparatus according to claim 8, further comprising
means for, when said speech input terminal comprises a
plurality of speech input terminals, storing the speech
recognition model in correspondence with each of said speech
input terminals.

19. The apparatus according to claim 10, further
comprising means for, when said speech input terminal
comprises a plurality of speech input terminals, storing the
data conversion condition in correspondence with each of
said speech input terminals.

20. The apparatus according to claim 11, further
comprising means for, when said speech input terminal
comprises a plurality of speech input terminals, storing the
data conversion condition in correspondence with each of
said speech input terminals.

21. A speech communication system comprising a speech
input terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein

said speech input terminal comprises
speech input means,

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, and

communication means for transmitting the information
to said speech recognition apparatus, and

said speech recognition apparatus comprises
means for executing speech recognition processing on
the basis of the information.

22. A speech communication system comprising a speech
input terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein

said speech recognition apparatus comprises
means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, on the basis of speech data from
said speech input terminal, and

means for executing speech recognition processing on
the basis of the information.

23. A speech communication system comprising a speech
input terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein

said speech input terminal comprises
speech input means,

means for creating information for speech recognition, 
which is unique to said speech input terminal or represents 
an operation state thereof, and 

communication means for transmitting the information 
to said speech recognition apparatus, and 

said speech recognition apparatus comprises 

means for determining a data conversion condition for 
communication on the basis of the information, and 

means for transmitting the data conversion condition 
to said speech input terminal. 

24. A speech communication system comprising a speech 
input terminal and a speech recognition apparatus which can 
communicate with each other through a wire or wireless 
communication network wherein 

said speech recognition apparatus comprises 

means for creating information for speech recognition, 
which is unique to said speech input terminal or represents 
an operation state thereof, on the basis of speech data from 
said speech input terminal, 

means for determining a data conversion condition for 
communication on the basis of the information, and 

means for transmitting the data conversion condition 
to said speech input terminal. 

25. A speech communication method of transmitting speech
data from a speech input terminal to a speech recognition
apparatus through a wire or wireless communication network
comprising:

in the speech input terminal,
the step of creating information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof; and

the step of transmitting the information to the speech
recognition apparatus.

26. A speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising:

the step of receiving information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof from the speech
input terminal; and

the step of executing speech recognition processing
on the basis of the information.

27. A speech communication method of executing speech
recognition processing for speech data transmitted from a
speech input terminal through a wire or wireless
communication network comprising:

the step of creating information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof, on the basis of
data transmitted from the speech input terminal; and

the step of executing speech recognition processing
on the basis of the information.

28. A speech communication method of executing speech 
recognition processing for speech data transmitted from a 
speech input terminal through a wire or wireless 
communication network comprising: 

the step of receiving information for speech 
recognition, which is unique to said speech input terminal 
or represents an operation state thereof from the speech 
input terminal; 

the step of determining a data conversion condition 
for communication on the basis of the information; and 

the step of transmitting the data conversion condition 
to the speech input terminal. 

29. A speech communication method of executing speech 
recognition processing for speech data transmitted from a 
speech input terminal through a wire or wireless 
communication network comprising: 

the step of creating information for speech 
recognition, which is unique to said speech input terminal 
or represents an operation state thereof, on the basis of 
data transmitted from the speech input terminal; 

the step of determining a data conversion condition 
for communication on the basis of the information; and 

the step of transmitting the data conversion condition
to the speech input terminal.

30. A speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising:

in the speech input terminal,
the step of creating information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof; and
the step of transmitting the information to the speech
recognition apparatus, and

in the speech recognition apparatus,
the step of executing speech recognition processing
on the basis of the information.

31. A speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising:

in the speech recognition apparatus,
the step of creating information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof, on the basis of
speech data from the speech input terminal; and

the step of executing speech recognition processing
on the basis of the information.

32. A speech communication method between a speech input
terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network comprising:

in the speech input terminal, 

the step of creating information for speech 
recognition, which is unique to said speech input terminal 
or represents an operation state thereof; and 

the step of transmitting the information to the speech 
recognition apparatus, and 

in the speech recognition apparatus, 

the step of determining a data conversion condition 
for communication on the basis of the information; and 

the step of transmitting the data conversion condition 
to the speech input terminal. 

33. A speech communication method between a speech input 
terminal and a speech recognition apparatus which can 
communicate with each other through a wire or wireless 
communication network comprising: 

in the speech recognition apparatus, 

the step of creating information for speech 
recognition, which is unique to said speech input terminal 
or represents an operation state thereof, on the basis of 
speech data from the speech input terminal; 

the step of determining a data conversion condition 
for communication on the basis of the information; and 

the step of transmitting the data conversion condition
to the speech input terminal.

34. A storage medium recording a program for, in order to
transmit speech data from a speech input terminal to a speech
recognition apparatus through a wire or wireless
communication network, causing a computer to function as
means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, and

communication means for transmitting the information
to said speech recognition apparatus.

35. A storage medium recording a program for, in order to
execute speech recognition processing on the basis of speech
data sent from a speech input terminal through a wire or
wireless communication network, causing a computer to
function as

means for receiving information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof from said speech
input terminal; and

means for executing speech recognition processing on
the basis of the information.

36. A storage medium recording a program for, in order to
execute speech recognition processing on the basis of speech
data sent from a speech input terminal through a wire or
wireless communication network, causing a computer to
function as

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, on the basis of the speech data
transmitted from said speech input terminal, and

means for executing speech recognition processing on
the basis of the information.

37. A storage medium recording a program for, in order to
execute speech recognition processing on the basis of speech
data sent from a speech input terminal through a wire or
wireless communication network, causing a computer to
function as

means for receiving information for speech
recognition, which is unique to said speech input terminal
or represents an operation state thereof from said speech
input terminal; and

means for determining a data conversion condition for
communication on the basis of the information, and

means for transmitting the data conversion condition
to said speech input terminal.

38. A storage medium recording a program for, in order to
execute speech recognition processing on the basis of speech
data sent from a speech input terminal through a wire or
wireless communication network, causing a computer to
function as

means for creating information for speech recognition,
which is unique to said speech input terminal or represents
an operation state thereof, on the basis of the speech data
transmitted from said speech input terminal,

means for determining a data conversion condition for
communication on the basis of the information, and

means for transmitting the data conversion condition
to said speech input terminal.

ABSTRACT OF THE DISCLOSURE
A speech communication system comprising a speech
input terminal and a speech recognition apparatus which can
communicate with each other through a wire or wireless
communication network wherein the speech input terminal
comprises a speech input unit, a unit for creating environment
information for speech recognition, which is unique to the
speech input terminal or represents its operation state, and
a communication control unit for transmitting the
environment information to the speech recognition apparatus,
and the speech recognition apparatus executes speech
recognition processing on the basis of the environment
information.



[FIG. (number not legible): flowchart of the processing
exchanged between the PORTABLE TERMINAL 100 and the MAIN
BODY 200, steps S401 to S420. Box labels recoverable from
the drawing: START; RECEIVE ENVIRONMENT INFORMATION; SEND
MESSAGE; PERFORM ACOUSTIC ANALYSIS; CREATE ENVIRONMENT MODEL
(AVERAGE, VARIANCE, AND THE LIKE); SEND ENVIRONMENT
INFORMATION TO MAIN BODY; OBTAIN ENVIRONMENT INFORMATION /
MODEL; PERFORM ENVIRONMENT ADAPTATION OF SPEECH RECOGNITION
MODEL; CREATE SQ TABLE FOR SPEECH RECOGNITION COMMUNICATION;
SEND SQ TABLE TO TERMINAL; OBTAIN AND HOLD SQ TABLE; INPUT
SPEECH; PERFORM ACOUSTIC ANALYSIS; SQ-ENCODE PARAMETERS BY
USING SQ TABLE AND SEND RESULTANT DATA TO MAIN BODY; OBTAIN
AND DECODE PARAMETERS TO PERFORM SPEECH RECOGNITION; OPERATE
APPLICATION IN ACCORDANCE WITH RECOGNITION RESULT; SEND
APPLICATION RESULT TO TERMINAL; OBTAIN APPLICATION RESULT;
OUTPUT APPLICATION RESULT; END.]



COMBINED DECLARATION AND POWER OF ATTORNEY 
FOR PATENT APPLICATION 

(Page 1) 

As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name; 

I believe I am the original, first and sole inventor (if only one name is listed below) or an 
original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed 
and for which a patent is sought on the invention entitled 

SPEECH INPUT TERMINAL, SPEECH RECOGNITION APPARATUS,
SPEECH COMMUNICATION SYSTEM, AND SPEECH COMMUNICATION
METHOD

the specification of which [x] is attached hereto. [ ] was filed on

as United States Application No. or PCT International Application No. 

and was amended on (if applicable). 

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to patentability as defined in 37 
CFR §1.56. 

I hereby claim foreign priority benefits under 35 U.S.C. § 119(a)-(d) or § 365(b), of any foreign
application(s) for patent or inventor's certificate, or § 365(a) of any PCT international application which 
designates at least one country other than the United States, listed below and have also identified below 
any foreign application for patent or inventor's certificate, or PCT international application having a 
filing date before that of the application on which priority is claimed: 

Country    Application No.    Filed (Day/Mo./Yr.)    Priority Claimed (Yes/No)

JAPAN      11-260760          14/09/1999             Yes

I hereby appoint the practitioners associated with the firm and customer number provided below 
to prosecute this application and to transact all business in the Patent and Trademark Office connected 
therewith, and direct that all correspondence be addressed to the address associated with that Customer 
Number: 

FITZPATRICK, CELLA, HARPER & SCINTO 
Customer Number: 05514 



I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these statements were 
made with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful 
false statements may jeopardize the validity of the application or any patent issued thereon. 



COMBINED DECLARATION AND POWER OF ATTORNEY
FOR PATENT APPLICATION

(Page 2)

Full Name of Sole or First Inventor: Yasuhiro KOMORI

Inventor's signature: (handwritten, illegible)

Date: (handwritten, illegible)    Citizen/Subject of Japan

Residence: 412-9, Chitose, Takatsu-ku, Kawasaki-shi,
Kanagawa-ken, Japan

Post Office Address: c/o CANON KABUSHIKI KAISHA,
30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, Japan

Full Name of Second Joint Inventor, if any: Masayuki YAMADA

Inventor's signature: (handwritten, illegible)

Date: (handwritten, illegible)    Citizen/Subject of Japan

Residence: 30-51-103, Shukugawara 4-chome, Tama-ku,
Kawasaki-shi, Kanagawa-ken, Japan

Post Office Address: c/o CANON KABUSHIKI KAISHA,
30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo, Japan



F511/A601948/ald 



